-
Notifications
You must be signed in to change notification settings - Fork 13.6k
expandFMINIMUMNUM_FMAXIMUMNUM: Quiet is not needed for NaN vs NaN #139237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-llvm-selectiondag Author: YunQiang Su (wzssyqa) ChangesNew LangRef doesn't requires quieting for NaN vs NaN, aka the result may be sNaN for sNaN vs NaN. Full diff: https://github.com/llvm/llvm-project/pull/139237.diff 1 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index ba34c72156228..2c63c54fc03f7 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -8683,11 +8683,6 @@ SDValue TargetLowering::expandFMINIMUMNUM_FMAXIMUMNUM(SDNode *Node,
SDValue MinMax =
DAG.getSelectCC(DL, LHS, RHS, LHS, RHS, IsMax ? ISD::SETGT : ISD::SETLT);
- // If MinMax is NaN, let's quiet it.
- if (!Flags.hasNoNaNs() && !DAG.isKnownNeverNaN(LHS) &&
- !DAG.isKnownNeverNaN(RHS)) {
- MinMax = DAG.getNode(ISD::FCANONICALIZE, DL, VT, MinMax, Flags);
- }
// Fixup signed zero behavior.
if (Options.NoSignedZerosFPMath || Flags.hasNoSignedZeros() ||
|
Missing test changes |
06f867c
to
8003083
Compare
OK to merge? @arsenm |
@arsenm ping |
12c5e7c
to
8e77aa2
Compare
; GFX8-NEXT: s_movk_i32 s4, 0x8000 | ||
; GFX8-NEXT: v_lshrrev_b32_e32 v4, 16, v3 | ||
; GFX8-NEXT: v_cndmask_b32_e32 v3, v1, v0, vcc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised how big the diff is here. I'm also surprised AMDGPU is going through this path for any type, bfloat should have promoted to float?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot find the code that set FMAXIMUMNUM/FMINIMUMNUM to promoted for BF16 for AMD64.
I guess that it is another bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index ade88a16193b..5e4bd36f96d0 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -213,7 +213,7 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
ISD::FLOG10, ISD::FEXP, ISD::FEXP2, ISD::FEXP10,
ISD::FCEIL, ISD::FTRUNC, ISD::FRINT, ISD::FNEARBYINT,
ISD::FROUND, ISD::FROUNDEVEN, ISD::FFLOOR, ISD::FCANONICALIZE,
- ISD::SETCC}) {
+ ISD::SETCC, ISD::FMAXIMUMNUM,ISD::FMINIMUMNUM}) {
// FIXME: The promoted to type shouldn't need to be explicit
setOperationAction(Opc, MVT::bf16, Promote);
AddPromotedToType(Opc, MVT::bf16, MVT::f32);
@@ -776,6 +776,10 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
Vec16, Custom);
setOperationAction(ISD::INSERT_VECTOR_ELT, Vec16, Expand);
}
+ for (MVT Vec16 :
+ {MVT::v2bf16, MVT::v4bf16, MVT::v8bf16, MVT::v16bf16, MVT::v32bf16}) {
+ setOperationAction({ISD::FMAXIMUMNUM, ISD::FMINIMUMNUM}, Vec16, Promote);
+ }
}
if (Subtarget->hasVOP3PInsts()) {
I have a try with this patch. It seems making some difference. Since I don't understand AMDGPU well, I don't know whether it is correct.
We will use it to be sure that the canonicalize is removed in llvm#139237
We will use it to be sure that the canonicalize is removed in #139237
… (#141218) We will use it to be sure that the canonicalize is removed in llvm/llvm-project#139237
New LangRef doesn't requires quieting for NaN vs NaN, aka the result may be sNaN for sNaN vs NaN. See: llvm#139228
8e77aa2
to
20e5fa0
Compare
New LangRef doesn't requires quieting for NaN vs NaN, aka the result may be sNaN for sNaN vs NaN.
See: #139228